ABSTRACT

Combining measurements which have "theoretical uncertainties" is a del- 
icate matter, due to an unclear statistical basis. We present an algorithm 
based on the notion that a theoretical uncertainty represents an estimate of 
bias. 

1 Introduction 

Combining measurements which have "theoretical uncertainties" is a deli- 
cate matter. Indeed, we are often in the position in which sufficient informa- 
tion is not provided to form a completely principled combination. Here, we 
develop a procedure based on the interpretation of theoretical uncertainties 
as estimates of bias. We compare and contrast this with a procedure that 
treats theoretical uncertainties along the same lines as statistical uncertain- 
ties. 

Suppose we are given two measurements, with results expressed in the 
form: 

$$
\begin{aligned}
&A \pm \sigma_A \pm t_A \\
&B \pm \sigma_B \pm t_B.
\end{aligned}
\tag{1}
$$

Assume that $A$ has been sampled from a probability distribution of the form
$p_A(A; \bar{A}, \sigma_A)$, where $\bar{A}$ is the mean of the distribution and
$\sigma_A$ is the standard deviation. We make the corresponding assumption for
$B$. The $t_A$ and $t_B$ uncertainties in Eq. (1) are the theoretical
uncertainties. We may not need to know exactly what that means here, except
that the same meaning should hold for both $t_A$ and $t_B$. We suppose that
both $A$ and $B$ are measurements of the same quantity of physical interest,
though possibly with quite different approaches. The question is: How do we
combine our two measurements?

Let the physical quantity we are trying to learn about be denoted $\theta$.
Given the two results $A$ and $B$, we wish to form an estimator,
$\hat{\theta}$, for $\theta$, with "statistical" and "theoretical"
uncertainties expressed separately in the form:

$$ \hat{\theta} \pm \sigma \pm t. \tag{2} $$

The quantities $\hat{\theta}$, $\sigma$, and $t$ are to be computed in terms of
$A$, $B$, $\sigma_A$, $\sigma_B$, $t_A$, and $t_B$.



2 Forming the Weighted Average 

In the absence of theoretical uncertainties, we would normally combine our 
measurements according to the weighted average: 

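$$ \hat{\theta} = \frac{\dfrac{A}{\sigma_A^2} + \dfrac{B}{\sigma_B^2}}{\dfrac{1}{\sigma_A^2} + \dfrac{1}{\sigma_B^2}}. \tag{3} $$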

For clarity, we are assuming for now that there is no statistical correlation 
between the measurements; such correlations will be incorporated later. 
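
As a minimal sketch of the purely statistical weighted average of Eq. (3), assuming two uncorrelated measurements, one might write the following. The function name `weighted_average` and the numerical inputs are illustrative assumptions, not part of the original text.

```python
def weighted_average(a, sigma_a, b, sigma_b):
    """Inverse-variance weighted average of two uncorrelated measurements."""
    w_a = 1.0 / sigma_a**2  # statistical weight of A
    w_b = 1.0 / sigma_b**2  # statistical weight of B
    return (w_a * a + w_b * b) / (w_a + w_b)


# Hypothetical inputs: A = 10 +- 1 and B = 12 +- 2.
theta_hat = weighted_average(10.0, 1.0, 12.0, 2.0)
print(theta_hat)
```

The more precise measurement dominates: the result lies closer to A than to B.
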

In general, $A$ and $B$ will be biased estimators for $\theta$:

$$
\begin{aligned}
\bar{A} &= \theta + b_A \\
\bar{B} &= \theta + b_B,
\end{aligned}
\tag{4}
$$

where $b_A$ and $b_B$ are the biases. We adopt the point of view that the
theoretical uncertainties $t_A$ and $t_B$ are estimates related to the possible
magnitudes of these biases. That is,

$$
\begin{aligned}
t_A &\sim |b_A| \\
t_B &\sim |b_B|.
\end{aligned}
\tag{5}
$$

We wish to have $t$ represent a similar notion.

Without yet specifying the weights, assume that we continue to form
$\hat{\theta}$ as a weighted average of $A$ and $B$:

$$ \hat{\theta} = \frac{w_A A + w_B B}{w_A + w_B}, \tag{6} $$

where $w_A$ and $w_B$ are the non-negative weights. The statistical error on the
weighted average is computed according to simple error propagation on the
individual statistical errors:

$$ \sigma^2 = \frac{w_A^2 \sigma_A^2 + w_B^2 \sigma_B^2}{(w_A + w_B)^2}. \tag{7} $$

The bias for $\hat{\theta}$ is:

$$ b = \langle \hat{\theta} - \theta \rangle = \frac{w_A b_A + w_B b_B}{w_A + w_B}. \tag{8} $$

If the theoretical uncertainties are regarded as estimates of the biases, then
the theoretical uncertainty should be evaluated with the same weighting:

$$ t = \frac{w_A t_A + w_B t_B}{w_A + w_B}. \tag{9} $$

It may be noted that this possesses desirable behavior in the limit where the
theoretical uncertainties are identical (completely correlated) between the
two measurements: The theoretical uncertainty on $\hat{\theta}$ is in this case
the same as $t_A = t_B$; no reduction is attained by having multiple
measurements.

However, it is not quite true that the theoretical uncertainties are being
regarded as estimates of bias. The $t_A$ and $t_B$ provide only estimates for
the magnitudes, not the signs, of the biases. Eq. (9) holds when the biases are
of the same sign. If the biases are of opposite sign, then we obtain

$$ t = \frac{|w_A t_A - w_B t_B|}{w_A + w_B}. \tag{10} $$

Thus, Eq. (9) breaks down in some cases. For example, suppose
the theoretical uncertainties are completely anticorrelated. In the case of
equal weights, the combined theoretical uncertainty should be zero, because
the two uncertainties are exactly canceled in the combined result. Only a
statistical uncertainty remains.

Unfortunately, we don't always know whether the biases are expected to
have the same sign or opposite sign. As a default, we adopt the procedure of
Eq. (9). In the case of similar measurements, we suspect that the biases
will often have the same sign, in which case we make the right choice.
In the case of quite different measurements, such as inclusive and exclusive
measurements of $V_{ub}$, there is no particular reason to favor either relative
sign; we simply don't know. The adopted procedure has the property that
it errs on the side of "conservatism" - we will sometimes overestimate the
theoretical uncertainty on the combined result.
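
The weighted combination described above can be sketched as follows, assuming the weights are supplied by the caller. The function `combine_two` and its `same_sign` flag are hypothetical names introduced only for illustration; the flag switches between the same-sign default and the opposite-sign expression for the theoretical uncertainty.

```python
import math


def combine_two(a, sigma_a, t_a, b, sigma_b, t_b, w_a, w_b, same_sign=True):
    """Weighted average with statistical and theoretical uncertainties."""
    norm = w_a + w_b
    theta_hat = (w_a * a + w_b * b) / norm
    # Statistical uncertainty: simple error propagation.
    sigma = math.sqrt((w_a * sigma_a) ** 2 + (w_b * sigma_b) ** 2) / norm
    if same_sign:
        # Default, conservative choice: biases assumed to share a sign.
        t = (w_a * t_a + w_b * t_b) / norm
    else:
        # Biases known to have opposite signs.
        t = abs(w_a * t_a - w_b * t_b) / norm
    return theta_hat, sigma, t


# Hypothetical inputs with equal weights and equal theoretical uncertainties.
same = combine_two(10.0, 1.0, 0.5, 11.0, 1.0, 0.5, 1.0, 1.0)
opposite = combine_two(10.0, 1.0, 0.5, 11.0, 1.0, 0.5, 1.0, 1.0, same_sign=False)
print(same, opposite)
```

With equal weights, the same-sign choice leaves the theoretical uncertainty unchanged at 0.5, while the opposite-sign choice cancels it entirely.
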

There is still a further issue. The results of the measurements themselves
can provide information on what the theoretical uncertainty could be.
Consider two measurements with negligible statistical uncertainty. Then the
difference between the two measurements is the difference between the
biases. If the measurements are far apart, on the scale of the theoretical
uncertainties, then this is evidence that the biases are of
opposite sign. We make no attempt to incorporate this information, again
erring on the conservative side.

We turn to the question of choice of weights $w_A$ and $w_B$. In the limit of
negligible theoretical uncertainties we want to have

$$ w_A = \frac{1}{\sigma_A^2}, \tag{11} $$

$$ w_B = \frac{1}{\sigma_B^2}. \tag{12} $$

Using these as the weights in the presence of theoretical uncertainties can
lead to undesirable behavior. For example, suppose $t_A \gg t_B$ and
$\sigma_A \ll \sigma_B$. The central value computed with only the statistical
weights ignores the theoretical uncertainty. A measurement with small
theoretical uncertainty may be given little weight compared to a measurement
with very large theoretical uncertainty. While not "wrong", this does not make
optimal use of the available information. We may invent a weighting scheme
which incorporates both the statistical and theoretical uncertainties, for
example combining them in quadrature:

$$
\begin{aligned}
w_A &= \frac{1}{\sigma_A^2 + t_A^2} \\
w_B &= \frac{1}{\sigma_B^2 + t_B^2}.
\end{aligned}
\tag{13}
$$

Any such scheme can lead to unattractive dependence on the way
measurements may be associatively combined. In order to have associativity in
combining three measurements $A$, $B$, $C$, the weight for the combination of
any two must be equal to the sum of the weights for those two, e.g.,
$w_{AB} = w_A + w_B$. This is inconsistent with our other requirements.
We shall adopt the procedure in Eq. (13), with the understanding that it is
best to go back to the original measurements when combining several results,
rather than making successive combinations.
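
The lack of associativity can be seen in a small numerical sketch, assuming the quadrature weights described above and the same-sign (linear) combination of the theoretical uncertainties. The function `combine` and the three input triples are illustrative assumptions.

```python
import math


def combine(measurements):
    """Combine (value, sigma, t) triples with weights 1 / (sigma^2 + t^2)."""
    weights = [1.0 / (s**2 + t**2) for _, s, t in measurements]
    norm = sum(weights)
    theta_hat = sum(w * x for w, (x, _, _) in zip(weights, measurements)) / norm
    sigma = math.sqrt(sum((w * s) ** 2 for w, (_, s, _) in zip(weights, measurements))) / norm
    t = sum(w * t_i for w, (_, _, t_i) in zip(weights, measurements)) / norm
    return theta_hat, sigma, t


# Hypothetical measurements given as (value, sigma, t).
A = (10.0, 1.0, 1.0)
B = (10.0, 1.0, 1.0)
C = (13.0, 1.0, 0.0)

direct = combine([A, B, C])                  # all three at once
successive = combine([combine([A, B]), C])   # A and B first, then C
print(direct[0], successive[0])
```

The two central values differ because the weight assigned to the intermediate result of A and B is not the sum of the individual weights, which is why going back to the original measurements is preferable.
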

3 Inconsistent Inputs



It may happen that our measurements are far enough apart that they appear 
inconsistent in terms of the quoted uncertainties. Our primary goal may be 
to test consistency between available data and a model, including whatever 
theoretical uncertainties exist in the comparison. We prefer to avoid mak- 
ing erroneous claims of inconsistency, even at the cost of some statistical 
power. Thus, we presume that when two measurements of what is assumed 
to be the same quantity appear inconsistent, something is wrong with the 
measurement or with the theoretical uncertainties in the computation. If
we have no good way to determine in detail where the fault lies, we adopt a
method similar to that used by the Particle Data Group (PDG) [1] to enlarge
the stated uncertainties.

Given our two measurements as discussed above, we define the quantity:

$$ \chi^2 = w_A (A - \hat{\theta})^2 + w_B (B - \hat{\theta})^2. \tag{14} $$



In the limit of purely statistical and normal errors, this quantity is dis- 
tributed according to a chi-square with one degree of freedom. In the 
more general situation here, we don't know the detailed properties, but 
we nonetheless use it as a measure of the consistency of the results, in the 
belief that the procedure we adopt will still tend to err toward conservatism. 

If $\chi^2 \le 1$, the measurements are deemed consistent. On the other hand,
if $\chi^2 > 1$, we call the measurements inconsistent, and apply a scale factor
to the errors in order to obtain $\chi^2 = 1$. We take the point of view that we
don't know which measurement (or both) is flawed, or whether the problem
is with the statistical or theoretical error evaluation. If we did have such
relevant information, we could use that in a more informed procedure. Thus,
we scale all of the errors ($\sigma_A$, $\sigma_B$, $t_A$, $t_B$) by a factor:

$$ S = \sqrt{\chi^2}. \tag{15} $$



This scaling does not change the central value of the averaged result, but 
does scale the statistical and theoretical uncertainties by the same factor. 
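
A minimal sketch of this consistency check follows. It assumes that the consistency measure is the weighted sum of squared residuals about the average, built with the quadrature weights of Eq. (13), and that the scale factor is its square root when it exceeds one; the function name `scale_factor` and the inputs are hypothetical.

```python
import math


def scale_factor(a, sigma_a, t_a, b, sigma_b, t_b):
    """Return (theta_hat, chi2, S) for two measurements."""
    w_a = 1.0 / (sigma_a**2 + t_a**2)
    w_b = 1.0 / (sigma_b**2 + t_b**2)
    theta_hat = (w_a * a + w_b * b) / (w_a + w_b)
    # Assumed consistency measure: weighted squared residuals.
    chi2 = w_a * (a - theta_hat) ** 2 + w_b * (b - theta_hat) ** 2
    # Errors are inflated only when the inputs look inconsistent.
    scale = math.sqrt(chi2) if chi2 > 1.0 else 1.0
    return theta_hat, chi2, scale


# Hypothetical, clearly inconsistent inputs: 10 +- 1 and 14 +- 1.
theta_hat, chi2, scale = scale_factor(10.0, 1.0, 0.0, 14.0, 1.0, 0.0)
print(theta_hat, chi2, scale)
```

Multiplying every input uncertainty by the returned factor divides each weight by its square, so the rescaled consistency measure becomes one while the central value is unchanged.
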

4 Relative Errors 

We often are faced with the situation in which the uncertainties are
relative, rather than absolute. In this case, the model in which $\theta$ is a
location parameter of a Gaussian distribution breaks down. However, it may be a
reasonable approximation to continue to think in terms of this model, with
some modification to mitigate bias. We also continue to work in the
context of a least-squares minimization, although it might be interesting to
investigate a maximum likelihood approach.

Thus, suppose we have additional experimental uncertainties $s_A$ and $s_B$,
which scale with $\theta$:

$$
\begin{aligned}
s_A &= r_A \theta \\
s_B &= r_B \theta.
\end{aligned}
\tag{16}
$$

If $s_k$ is what we are given, we infer the proportionality constants according
to $r_A = s_A/A$ and $r_B = s_B/B$.

The weights that are given in Eq. (13) are modified to incorporate this
new source of uncertainty according to:

$$
\begin{aligned}
w_A &= \frac{1}{\sigma_A^2 + (r_A \hat{\theta})^2 + t_A^2} \\
w_B &= \frac{1}{\sigma_B^2 + (r_B \hat{\theta})^2 + t_B^2}.
\end{aligned}
\tag{17}
$$

Note that, as we don't know $\theta$, we use $\hat{\theta}$ instead. This means
that the averaging process is now iterative, until convergence to a particular
value of $\hat{\theta}$ is obtained.
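
The iteration can be sketched as a fixed-point loop. The starting guess (the unweighted mean), the tolerance, and the iteration cap are assumptions of this sketch and are not specified in the text; `iterate_average` is a hypothetical name.

```python
def iterate_average(values, sigmas, rel, ts, tol=1e-12, max_iter=100):
    """Weighted average whose weights depend on the current estimate.

    values: measured central values
    sigmas: absolute statistical uncertainties
    rel:    proportionality constants r_k = s_k / x_k
    ts:     theoretical uncertainties
    """
    theta = sum(values) / len(values)  # starting guess (assumed)
    for _ in range(max_iter):
        weights = [1.0 / (s**2 + (r * theta) ** 2 + t**2)
                   for s, r, t in zip(sigmas, rel, ts)]
        new_theta = sum(w * x for w, x in zip(weights, values)) / sum(weights)
        if abs(new_theta - theta) < tol:
            return new_theta
        theta = new_theta
    raise RuntimeError("weighted average did not converge")


# Hypothetical inputs: relative uncertainties quoted as s_A = 1.0, s_B = 0.6.
values = [10.0, 12.0]
rel = [1.0 / 10.0, 0.6 / 12.0]  # r_A = s_A / A, r_B = s_B / B
theta_hat = iterate_average(values, [0.5, 0.5], rel, [0.3, 0.3])
print(theta_hat)
```

For inputs like these the weights change only slowly with the estimate, so the loop settles after a handful of iterations.
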

Likewise, there may be a theoretical uncertainty which scales with $\theta$, and
we may treat this similarly. Thus, suppose that, for example,
$t_A^2 = t_{aA}^2 + t_{rA}^2$, where $t_{aA}$ is an absolute uncertainty, and
$t_{rA} = \rho_A \theta$. We simply replace $\theta$ by $\hat{\theta}$ and
substitute this expression wherever $t_A$ appears, e.g., in Eq. (17).
That is:

$$ w_A = \frac{1}{\sigma_A^2 + (r_A \hat{\theta})^2 + t_{aA}^2 + (\rho_A \hat{\theta})^2}. \tag{18} $$

5 Summary of Algorithm 

We summarize the proposed algorithm, now including possible statistical
correlations: Suppose we have $n$ measurements
$\{x_i \mid i = 1, 2, \ldots, n\}$ with covariance matrix

$$ M_{ij} = \langle (x_i - \langle x_i \rangle)(x_j - \langle x_j \rangle) \rangle, \tag{19} $$

and mean values

$$ \langle x_i \rangle = \theta + b_i. \tag{20} $$

Note that, in the non-correlated case, $M_{ij} = \sigma_i^2 \delta_{ij}$, or
including relative uncertainties,
$M_{ij} = \delta_{ij}(\sigma_i^2 + r_i^2 \langle x_i \rangle^2)$. The parameter we
are trying to learn about is $\theta$, and $b_i$ is the bias that is being
estimated with theoretical uncertainties $t_i$.

The present notion of the weighted average is that we find a $\hat{\theta}$
which minimizes:

$$ \chi^2 = \sum_{i,j} (x_i - \hat{\theta}) W_{ij} (x_j - \hat{\theta}). \tag{21} $$

This is based on the premise that we don't actually know what the biases
are, and we do the minimization with zero bias in the $(x - \theta)$ dependence.
The possible size of bias is taken into account in the weighting, giving more
weight to those measurements in which the size of the bias is likely to be
smaller.

The "weight matrix" $W$ in principle could be taken to be:

$$ (W^{-1})_{ij} = M_{ij} + t_i t_j. \tag{22} $$

That is, $W^{-1}$ is an estimate for

$$ \langle (x_i - \theta)(x_j - \theta) \rangle = M_{ij} + b_i b_j. \tag{23} $$

However, we don't assume that we know the relative signs of $b_i$ and $b_j$.
Hence, the off-diagonal $t_i t_j$ term in Eq. (22) could just as likely enter
with a minus sign. We therefore use the weight matrix:

$$ (W^{-1})_{ij} = M_{ij} + t_i^2 \delta_{ij}. \tag{24} $$

If we do know the relative signs of the biases, for example because the
theoretical uncertainties are correlated, then the off-diagonal terms in
Eq. (22) should be included, with the appropriate sign. When using the term
"correlated" with theoretical uncertainties, it should be kept in mind that it
does not necessarily have a statistical interpretation.

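Setting $d\chi^2/d\theta|_{\theta=\hat{\theta}} = 0$ gives the central value
("best" estimate):

$$ \hat{\theta} = \frac{\sum_{i,j} W_{ij} x_j}{\sum_{i,j} W_{ij}}. \tag{25} $$

The statistical uncertainty is

$$ \sigma = \frac{1}{\sum_{i,j} W_{ij}} \sqrt{\sum_{i,j} (W M W)_{ij}}. \tag{26} $$

Note that this reduces to

$$ \sigma = \frac{1}{\sqrt{\sum_{i,j} (M^{-1})_{ij}}} \tag{27} $$

in the case of only statistical uncertainties. The theoretical uncertainty is

$$ t = \frac{\sum_{i,j} W_{ij} t_j}{\sum_{i,j} W_{ij}}, \tag{28} $$

where

$$ t_j = \sqrt{t_{aj}^2 + (\rho_j \hat{\theta})^2}. \tag{29} $$

Finally, if $\chi^2 > n - 1$, these error estimates are scaled by a factor:

$$ S = \sqrt{\frac{\chi^2}{n - 1}}, \tag{30} $$

where $\chi^2$ here is the value after the minimization.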
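
The summarized algorithm can be sketched in plain Python as follows. Several choices here are assumptions rather than statements from the text: the weight matrix is built with the diagonal theoretical term (unknown relative signs), the statistical uncertainty is obtained by error propagation with the covariance matrix, the theoretical uncertainty is the weighted linear average of the individual values, and relative uncertainties are omitted for brevity. The helper names `invert` and `combine` are hypothetical.

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-300:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def combine(x, M, t):
    """Return (theta_hat, sigma, t_combined, chi2) for n measurements."""
    n = len(x)
    # Inverse weight matrix: covariance plus diagonal theoretical term.
    W = invert([[M[i][j] + (t[i] ** 2 if i == j else 0.0) for j in range(n)]
                for i in range(n)])
    row_sums = [sum(W[i]) for i in range(n)]
    total = sum(row_sums)
    coeff = [rs / total for rs in row_sums]  # theta_hat = sum_i coeff_i x_i
    theta_hat = sum(c * xi for c, xi in zip(coeff, x))
    sigma = math.sqrt(sum(coeff[i] * M[i][j] * coeff[j]
                          for i in range(n) for j in range(n)))
    t_comb = sum(c * ti for c, ti in zip(coeff, t))
    chi2 = sum((x[i] - theta_hat) * W[i][j] * (x[j] - theta_hat)
               for i in range(n) for j in range(n))
    # Inflate both uncertainties when the inputs look inconsistent.
    scale = math.sqrt(chi2 / (n - 1)) if n > 1 and chi2 > n - 1 else 1.0
    return theta_hat, scale * sigma, scale * t_comb, chi2


# Hypothetical inputs: two uncorrelated measurements, only the first has t != 0.
result = combine([10.0, 11.0], [[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0])
print(result)
```

A hand-written inverse keeps the sketch free of third-party dependencies; in practice a linear-algebra library would normally be used for the matrix inversion.
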

6 Comparison with treating theoretical uncertain- 
ties on same footing as statistical 

Another approach to the present problem is to simply treat the theoretical
uncertainties as if they were statistical [2]. This procedure gives the same
estimator as above for $\theta$. However, the results for statistical and
theoretical uncertainties differ in general.

Let $\sigma'$ be the estimated statistical uncertainty on the average for this
approach, and let $t'$ be the estimated theoretical uncertainty. Also, let
$T_{ij}$ be the "covariance matrix" for the theoretical uncertainties in this
picture. Then the statistical and theoretical uncertainties on the average are
given by:



$$ \sigma' = \ldots \tag{31} $$

$$ t' = \ldots \tag{32} $$



Note that the weights are given, as before, by

$$ W_{ij} = (M + T)^{-1}_{ij}. \tag{33} $$

That is, the weights are the same as in the treatment earlier, if the same
assumptions about theoretical correlations are made in both places.

The estimates for the statistical and theoretical uncertainties differ
between the two methods. That is, in general, $\sigma' \neq \sigma$ and
$t' \neq t$.

The statistical uncertainty $\sigma$ is computed from the individual
statistical uncertainties according to simple error propagation. The statistical
uncertainty $\sigma'$ is evaluated by identifying a piece of the overall
quadratic combination of statistical and theoretical uncertainties as
"statistical".

The difference between $t$ and $t'$ is that $t$ is computed as a weighted
average of the individual $t$'s, while $t'$ is evaluated by identifying a piece
of the overall quadratic combination of statistical and theoretical
uncertainties as "theoretical". The approach for $t$ is based on the notion that
the theoretical uncertainties are estimates of bias, but with a conservative
treatment of any unknown correlations. The $t'$ approach may be appropriate if
the theoretical uncertainties are given a probabilistic interpretation.

Let's consider some possible special cases. Suppose that all of the $t$'s
are the same, equal to $t_1$, and suppose that the theory uncertainties are
presumed to be "uncorrelated". In this case,

$$ t = t_1, \tag{34} $$

$$ t' = t_1/\sqrt{n}. \tag{35} $$

Which is more reasonable? That depends on how we view the meaning of
"uncorrelated" in our assumption, and on whether we assign a probabilistic
interpretation to the theoretical uncertainties. If we are supposing that the
actual theoretical uncertainties are somehow randomly distributed in sign
and magnitude, then it is reasonable to expect that the result will become
more reliable as more numbers are averaged. However, if we consider the
theoretical uncertainties as estimates of bias, which could in fact all have the
same sign, then the weighted linear average is plausible. It is at least a more
conservative approach in the absence of real information on the correlations.

Note that if the correlation in theoretical uncertainty is actually known,
the weighted linear average will take that into account. For example,
suppose there are just two measurements, with $t_1 = -t_2$. If the weights are
the same (that is, we also have $\sigma_1 = \sigma_2$) then $t = 0$. The other
approach also gives $t' = 0$.
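
The different scaling with the number of measurements can be illustrated numerically. This sketch assumes quadrature weights and the weighted linear average for the bias-estimate approach; the value $t_1/\sqrt{n}$ for the other approach is simply the one quoted above, not recomputed here. The function name and inputs are hypothetical.

```python
import math


def linear_theory_uncertainty(sigmas, ts):
    """Weighted linear average of the individual theoretical uncertainties."""
    weights = [1.0 / (s**2 + t**2) for s, t in zip(sigmas, ts)]
    return sum(w * t for w, t in zip(weights, ts)) / sum(weights)


t1 = 0.5  # common theoretical uncertainty (hypothetical value)
for n in (1, 4, 16):
    t_linear = linear_theory_uncertainty([1.0] * n, [t1] * n)
    t_quadrature = t1 / math.sqrt(n)  # value quoted for the other approach
    print(n, t_linear, t_quadrature)
```

The linear average stays at $t_1$ however many measurements are combined, while the quadrature-style value shrinks as $1/\sqrt{n}$.
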



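A different illustrative case is when $t_2 = 0$, $t_1 \neq 0$, and
$M = \mathrm{diag}(\sigma_1^2, \sigma_2^2)$. In this case, we find

$$ \hat{\theta} = \theta' = \frac{x_1 \sigma_2^2 + x_2 (\sigma_1^2 + t_1^2)}{\sigma_1^2 + \sigma_2^2 + t_1^2}, \tag{36} $$

$$ \sigma = \frac{\sigma_2 \sqrt{\sigma_1^2 \sigma_2^2 + (\sigma_1^2 + t_1^2)^2}}{\sigma_1^2 + \sigma_2^2 + t_1^2}, \tag{37} $$

$$ t = \frac{t_1 \sigma_2^2}{\sigma_1^2 + \sigma_2^2 + t_1^2}, \tag{38} $$

$$ \sigma' = \ldots \tag{39} $$

$$ t' = \ldots \tag{40} $$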



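To understand the difference better, consider the limit in which
$t_1 \gg \sigma_1, \sigma_2$:

$$ \hat{\theta} = \theta' = x_1 \frac{\sigma_2^2}{\sigma_1^2 + t_1^2} + x_2 \approx x_2, \tag{41} $$

$$ \sigma = \sigma_2, \tag{42} $$

$$ t = 0, \tag{43} $$

$$ \sigma' = \frac{\sigma_2}{\sqrt{2}}, \tag{44} $$

$$ t' = \frac{\sigma_2}{\sqrt{2}}. \tag{45} $$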



In this limit, both methods agree that the important information is in $x_2$.
The first method assigns a statistical error corresponding to the statistical
uncertainty of $x_2$, and a theoretical uncertainty of zero, reflecting the zero
theoretical uncertainty on the $x_2$ measurement. The second method,
however, assigns equal statistical and theoretical uncertainties to the average.
Their sum in quadrature is a plausible expression of the total uncertainty,
but the breakdown into theoretical and statistical components is not
reasonable in the second method.
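
The behavior of the first method in this limit can be checked numerically. The sketch below assumes quadrature weights and the weighted linear average for the theoretical uncertainty; the function name `first_method` and the input values are hypothetical, with the first theoretical uncertainty chosen very large.

```python
import math


def first_method(x1, sigma1, t1, x2, sigma2, t2):
    """Bias-estimate combination of two uncorrelated measurements."""
    w1 = 1.0 / (sigma1**2 + t1**2)
    w2 = 1.0 / (sigma2**2 + t2**2)
    norm = w1 + w2
    theta_hat = (w1 * x1 + w2 * x2) / norm
    sigma = math.sqrt((w1 * sigma1) ** 2 + (w2 * sigma2) ** 2) / norm
    t = (w1 * t1 + w2 * t2) / norm
    return theta_hat, sigma, t


# Hypothetical inputs: the first measurement has a huge theoretical uncertainty.
theta_hat, sigma, t = first_method(10.0, 1.0, 1000.0, 12.0, 1.0, 0.0)
print(theta_hat, sigma, t)
```

The average is pulled almost entirely to the second measurement, the statistical uncertainty approaches that of the second measurement, and the theoretical uncertainty becomes negligible.
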

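Another limit we can take in this example is $\sigma_1 \ll t_1 \ll \sigma_2$,
obtaining:

$$ \hat{\theta} = \theta' = x_1 + x_2 \frac{t_1^2}{\sigma_2^2} \approx x_1, \tag{46} $$

$$ \sigma = \sigma_1, \tag{47} $$

$$ t = t_1, \tag{48} $$

$$ \sigma' = \frac{t_1}{\sqrt{2}}, \tag{49} $$

$$ t' = \frac{t_1}{\sqrt{2}}. \tag{50} $$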

Similar observations may be made in this case as in the previous.

7 Conclusions 

We suggest a procedure to treat theoretical uncertainties when combining 
measurements of a quantity. It must be emphasized that, at least without 
more information about the nature of the theoretical uncertainty, there is 
no rigorous procedure, e.g., in the context of frequencies. The interpretation 
we adopt is that theoretical errors represent estimates of bias. This leads to 
a straightforward algorithm. If the sign of the bias is not specified (the usual 
situation), the procedure is designed to be "conservative", in the sense that 
we may err on the side of overstating theory uncertainties on the combined 
result. There is also some arbitrariness in construction of the weights from 
the statistical and theoretical uncertainties; we suggest simply adding the 
uncertainties in quadrature. 

Our procedure is compared with a method that treats theoretical uncer- 
tainties as if they are of statistical origin. While there are some reassuring 
similarities, there are also differences. For example, the two procedures lead 
to different scaling behavior for the theoretical uncertainty with the num- 
ber of measurements. Depending on interpretation, either result could be 
regarded as reasonable; our procedure yields a more "conservative" scaling. 
Our procedure also does a better job of keeping meaningful separation be- 
tween statistical and theoretical uncertainties as results are combined. In 
the case where there are no theoretical uncertainties, both procedures yield 
identical, conventional results. It may be useful in practice to compute the
uncertainties via both approaches, giving an idea of how sensitive the result
is to the assumptions.

It would, of course, be nice to have a test of the procedure. For example, 
does it lead to appropriate frequency coverage? Given the lack of clarity (and 
perhaps lack of consistency) in what theoretical uncertainties really mean, 
a meaningful test is difficult. We thus rely on the conservative nature of our 
procedure - the intervals obtained for the combined results will "probably" 
over-cover if theoretical uncertainties are present. 